Forest Rescoring: Faster Decoding with Integrated Language Models

Authors

  • Liang Huang
  • David Chiang
Abstract

Efficient decoding has been a fundamental problem in machine translation, especially with an integrated language model which is essential for achieving good translation quality. We develop faster approaches for this problem based on k-best parsing algorithms and demonstrate their effectiveness on both phrase-based and syntax-based MT systems. In both cases, our methods achieve significant speed improvements, often by more than a factor of ten, over the conventional beam-search method at the same levels of search error and translation accuracy.
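As a rough illustration of the kind of lazy enumeration that k-best algorithms rely on, the sketch below lists the k lowest-cost combinations of two sorted candidate lists without scoring every pair. It is a minimal sketch, not the authors' implementation: the function name `k_best_sums` is hypothetical, costs are assumed to combine by plain addition, and integration of a language model score is not covered.

```python
import heapq


def k_best_sums(a, b, k):
    """Return the k smallest sums a[i] + b[j].

    Assumes a and b are sorted in ascending order (lower cost is better)
    and that the combined cost is the plain sum of the two parts.
    """
    if not a or not b or k <= 0:
        return []

    # Start from the best corner of the grid and expand lazily.
    heap = [(a[0] + b[0], 0, 0)]
    seen = {(0, 0)}
    out = []

    while heap and len(out) < k:
        cost, i, j = heapq.heappop(heap)
        out.append(cost)
        # Only the two grid neighbours of a popped cell can be next-best.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(a) and nj < len(b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (a[ni] + b[nj], ni, nj))

    return out


if __name__ == "__main__":
    print(k_best_sums([1, 3, 5], [2, 4, 8], 4))
```

Because each popped cell adds at most two new cells to the heap, only a small frontier of the grid is ever scored when k is small relative to the number of possible pairs.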

Similar Resources

Investigations on Phrase-based Decoding with Recurrent Neural Network Language and Translation Models

This work explores the application of recurrent neural network (RNN) language and translation models during phrase-based decoding. Due to their use of unbounded context, the decoder integration of RNNs is more challenging compared to the integration of feedforward neural models. In this paper, we apply approximations and use caching to enable RNN decoder integration, while requiring reasonable m...

Forest-based Algorithms in Natural Language Processing

Liang Huang. Supervisors: Aravind K. Joshi and Kevin Knight. Many problems in Natural Language Processing (NLP) involve an efficient search for the best derivation over (exponentially) many candidates. For example, a parser aims to find the best syntactic tree for a given sentence among all derivations under a grammar, and a machine translat...

A Search in the Forest: Efficient Algorithms for Parsing and Machine Translation Based on Packed Forests (dissertation proposal in Computer and Information Science)

Many problems in Natural Language Processing (NLP) involve an efficient search for the best derivation over (exponentially) many candidates. For example, a parser aims to find the best syntactic tree for a given sentence among all derivations under a grammar, and a machine translation (MT) decoder explores the space of all possible translations of the source-language sentence. In these cases, ...

Fuzzy class rescoring: a part-of-speech language model

Current speech recognition systems usually use word-based trigram language models. More elaborate models are applied to word lattices or N-best lists in a rescoring pass following the acoustic decoding process. In this paper we consider techniques for dealing with class-based language models in the lattice rescoring framework of our JANUS large vocabulary speech recognizer. We demonstrate how t...

Direct word graph rescoring using A* search and RNNLM

The usage of Recurrent Neural Network Language Models (RNNLMs) has allowed reaching significant improvements in Automatic Speech Recognition (ASR) tasks. However, to take advantage of their capability for considering long histories, they are usually used to rescore the N-best lists (i.e. it is in practice not possible to use them directly during acoustic trellis search). We propose in this pape...


Publication date: 2007